1 Introduction
DEprot (Differential Expression proteomics) is a toolkit for the normalization, imputation, and differential expression analysis of proteomics data. The data are assumed to be LFQ (label-free quantitation) values.
1.1 Citation
If you use this package, please cite:
Citation
“No publication associated yet et al., XYZ (2123).”
doi: XYZ/XYZ
citation("DEprot")
>
> To cite package 'DEprot' in publications use:
>
> Sebastian Gregoricchio (NA). DEprot: An R-package for proteomics
> differential analyses. R package version 0.1.0.
> https://sebastian-gregoricchio.github.io/DEprot/
> https://github.com/sebastian-gregoricchio/DEprot/
> https://sebastian-gregoricchio.github.io/
>
> A BibTeX entry for LaTeX users is
>
> @Manual{,
> title = {DEprot: An R-package for proteomics differential analyses},
> author = {Sebastian Gregoricchio},
> note = {R package version 0.1.0},
> url = {https://sebastian-gregoricchio.github.io/DEprot/
> https://github.com/sebastian-gregoricchio/DEprot/
> https://sebastian-gregoricchio.github.io/},
> }
>
> ATTENTION: This citation information has been auto-generated from the
> package DESCRIPTION file and may need manual editing, see
> 'help("citation")'.
2 Loading the data
The starting point of the package is building a DEprot object. This class of objects is specific to this package and requires at least two elements:
- counts table: this must be a matrix in which the column names represent the samples while the row names identify the proteins. The values in the matrix are assumed to be LFQ (label-free quantitation) values, either in linear or log-transformed format.
- metadata: this must be a data.frame containing at least one column named column.id, whose values correspond to the column names of the counts table. Any additional column can be added (cell line, treatment, condition, timing, etc.) and used to define groups for differential and quality control (PCA and correlation) analyses.
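The two inputs described above can be sketched in base R. This is only an illustration with made-up values: the object names `counts`, `metadata` and `ids.match` are hypothetical and not part of DEprot, and the check at the end is just one way to verify that every sample is described exactly once.

```r
# Toy inputs (hypothetical values, not the DEprot dataset)
set.seed(1)
counts <- matrix(rnorm(12, mean = 18, sd = 2),
                 nrow = 3,
                 dimnames = list(paste0("protein.", 1:3),        # rows = proteins
                                 paste0("Sample_", LETTERS[1:4]))) # columns = samples

metadata <- data.frame(column.id = colnames(counts),
                       condition = c("ctrl", "ctrl", "treated", "treated"),
                       replicate = c("rep1", "rep2", "rep1", "rep2"),
                       stringsAsFactors = FALSE)

# Every counts column should be described by exactly one metadata row
ids.match <- setequal(metadata$column.id, colnames(counts)) &&
  !anyDuplicated(metadata$column.id)
ids.match
```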
In the following sections we will refer to a dataset (pre-loaded in DEprot) in which a breast cancer (BCa) cell line was cultured in either hormone-deprived medium or full medium (FBS). Cells cultured in hormone-deprived conditions were then treated for 6 hours with either \(\beta\)-estradiol (E2) or vehicle (DMSO). Four biological replicates were analyzed. Hence, the dataset consists of 1 cell line x 3 conditions x 4 replicates, for a total of 12 samples. Proteins and samples have been “anonymized”.
2.1 Collect pre-loaded data
The counts are LFQ values in log2 format and are not imputed.
| column.id | sample.id | cell | condition | combined.id | replicate |
|---|---|---|---|---|---|
| Sample_A | BCa_FBS_rep1 | BCa | FBS | BCa_FBS | rep1 |
| Sample_B | BCa_6h.DMSO_rep1 | BCa | 6h.DMSO | BCa_6h.DMSO | rep1 |
| Sample_C | BCa_6h.10nM.E2_rep1 | BCa | 6h.10nM.E2 | BCa_6h.10nM.E2 | rep1 |
| Sample_D | BCa_FBS_rep2 | BCa | FBS | BCa_FBS | rep2 |
| Sample_E | BCa_6h.DMSO_rep2 | BCa | 6h.DMSO | BCa_6h.DMSO | rep2 |
| Sample_F | BCa_6h.10nM.E2_rep2 | BCa | 6h.10nM.E2 | BCa_6h.10nM.E2 | rep2 |
| Sample_G | BCa_FBS_rep3 | BCa | FBS | BCa_FBS | rep3 |
| Sample_H | BCa_6h.DMSO_rep3 | BCa | 6h.DMSO | BCa_6h.DMSO | rep3 |
| Sample_I | BCa_6h.10nM.E2_rep3 | BCa | 6h.10nM.E2 | BCa_6h.10nM.E2 | rep3 |
| Sample_J | BCa_FBS_rep4 | BCa | FBS | BCa_FBS | rep4 |
| Sample_K | BCa_6h.DMSO_rep4 | BCa | 6h.DMSO | BCa_6h.DMSO | rep4 |
| Sample_L | BCa_6h.10nM.E2_rep4 | BCa | 6h.10nM.E2 | BCa_6h.10nM.E2 | rep4 |
# log2(LFQ) values (not imputed)
data("unimputed.counts", package = "DEprot")
head(unimputed.counts[,1:6])
| | Sample_A | Sample_B | Sample_C | Sample_D | Sample_E | Sample_F |
|---|---|---|---|---|---|---|
| protein.1 | 17.7830 | 18.3472 | 18.7880 | 16.4864 | 17.6476 | 17.7074 |
| protein.2 | 22.4649 | 22.3682 | 23.0818 | 21.3739 | 21.7369 | 21.8696 |
| protein.3 | 17.4850 | 16.9217 | 16.3817 | 17.7925 | 17.3193 | 17.4829 |
| protein.4 | 20.4191 | 20.5265 | 20.3660 | 20.4036 | 20.5500 | 20.6084 |
| protein.5 | 18.8366 | 18.8066 | 19.2993 | 17.8886 | 18.0043 | 18.6683 |
| protein.6 | 13.6840 | 12.7965 | 13.3525 | 13.0671 | 13.1504 | 13.1831 |
2.2 Build DEprot object
Now we will combine the counts and metadata to create a DEprot object (hereafter referred to as dpo).
Another important piece of information is whether the data are log-transformed and, if so, the base of the logarithm. The recommended transformation is log2(score + 1).
If the data are pre-normalized and/or pre-imputed, this can be indicated with the corresponding parameters. If this is not the case, leave the parameters as NA (not NULL!).
Finally, if the metadata table does not contain a column.id column corresponding to the column names of the counts table, it is possible to indicate the name of another column that should be used as the column.id.
dpo <- load.counts(counts = unimputed.counts,
metadata = sample.config,
log.base = 2,
imputation = NA,
normalization.method = NA,
column.id = "column.id")
dpo
> DEprot object:
> Samples: 12
> Proteins: 13239
> Counts available: raw
> Log transformation: log2
> Metadata columns: column.id, sample.id, cell, condition, combined.id, replicate
This object is an S4 object of class DEprot. S4 objects are containers of slots that can be accessed using the symbol @ (e.g., object@slot.id).
The structure of an object of class DEprot (and DEprot.analyses) is the following:
| Slot | Description |
|---|---|
| raw.counts | table containing the raw counts, if not available it will be NULL |
| norm.counts | table containing the normalized counts, if not available it will be NULL |
| imputed.counts | table containing the imputed counts, if not available it will be NULL |
| log.base | a number indicating the base of the log used to transform the table, if not available it will be NA |
| log.transformed | logical value indicating whether the data are log-transformed or not |
| imputed | logical value indicating whether the data are imputed or not |
| imputation | discussed further in the Imputation paragraph |
| normalized | logical value indicating whether the data are normalized or not |
| normalization.method | a string indicating the type of normalization applied, if not available it will be NA |
| boxplot.raw | box+violin plot of the distribution of the LFQ intensities per sample obtained from the raw counts |
| boxplot.norm | box+violin plot of the distribution of the LFQ intensities per sample obtained from the normalized counts |
| boxplot.imputed | box+violin plot of the distribution of the LFQ intensities per sample obtained from the imputed counts |
| analyses.result.list | discussed further in the Differential Expression analyses paragraph |
| contrasts | discussed further in the Differential Expression analyses paragraph |
| differential.analyses.params | discussed further in the Differential Expression analyses paragraph |
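For readers unfamiliar with S4, slot access can be illustrated with a small stand-alone class. The class `ToyObject` below is hypothetical and unrelated to the real DEprot class definition; it only borrows two slot names from the table above to show how `@`, `slot()` and `slotNames()` behave.

```r
library(methods)

# Hypothetical toy class, NOT the DEprot class definition
setClass("ToyObject",
         representation(raw.counts = "matrix",
                        log.base = "numeric"))

toy <- new("ToyObject",
           raw.counts = matrix(1:4, nrow = 2),
           log.base = 2)

toy@log.base            # direct slot access with @
slot(toy, "log.base")   # equivalent access by slot name
slotNames(toy)          # names of all slots defined for the class
```

The same pattern (`dpo@raw.counts`, `slotNames(dpo)`) should apply to any S4 object, including the ones produced by this package.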
2.3 Rename sample columns
As in our example, sometimes the column names of the counts are not the actual IDs of the samples, but rather an arbitrary identifier. It is possible to rename the counts columns using any column of the metadata table, provided that its values are unique. The original identifiers are stored in a new column (old.column.id) of the metadata. Note that the renaming is applied to all the counts tables available.
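The renaming logic can be pictured with a base R sketch. This is an assumption about the general idea (a lookup through the metadata), not the actual implementation of rename.samples; all values and object names are hypothetical.

```r
# Toy example (hypothetical identifiers)
counts <- matrix(1:6, nrow = 2,
                 dimnames = list(c("protein.1", "protein.2"),
                                 c("Sample_A", "Sample_B", "Sample_C")))
metadata <- data.frame(column.id = c("Sample_A", "Sample_B", "Sample_C"),
                       sample.id = c("ctrl_rep1", "ctrl_rep2", "treated_rep1"),
                       stringsAsFactors = FALSE)

# The new names must be unique, otherwise columns would become ambiguous
stopifnot(!anyDuplicated(metadata$sample.id))

# Look up the new name of each counts column through the metadata
new.names <- metadata$sample.id[match(colnames(counts), metadata$column.id)]

metadata$old.column.id <- metadata$column.id   # keep the original identifiers
metadata$column.id <- metadata$sample.id       # the new IDs become the column.id
colnames(counts) <- new.names
```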
dpo <- rename.samples(DEprot.object = dpo,
metadata.column = "sample.id")
get.metadata(dpo)
> column.id sample.id cell condition combined.id
> 1 BCa_FBS_rep1 BCa_FBS_rep1 BCa FBS BCa_FBS
> 2 BCa_6h.DMSO_rep1 BCa_6h.DMSO_rep1 BCa 6h.DMSO BCa_6h.DMSO
> 3 BCa_6h.10nM.E2_rep1 BCa_6h.10nM.E2_rep1 BCa 6h.10nM.E2 BCa_6h.10nM.E2
> 4 BCa_FBS_rep2 BCa_FBS_rep2 BCa FBS BCa_FBS
> 5 BCa_6h.DMSO_rep2 BCa_6h.DMSO_rep2 BCa 6h.DMSO BCa_6h.DMSO
> 6 BCa_6h.10nM.E2_rep2 BCa_6h.10nM.E2_rep2 BCa 6h.10nM.E2 BCa_6h.10nM.E2
> 7 BCa_FBS_rep3 BCa_FBS_rep3 BCa FBS BCa_FBS
> 8 BCa_6h.DMSO_rep3 BCa_6h.DMSO_rep3 BCa 6h.DMSO BCa_6h.DMSO
> 9 BCa_6h.10nM.E2_rep3 BCa_6h.10nM.E2_rep3 BCa 6h.10nM.E2 BCa_6h.10nM.E2
> 10 BCa_FBS_rep4 BCa_FBS_rep4 BCa FBS BCa_FBS
> 11 BCa_6h.DMSO_rep4 BCa_6h.DMSO_rep4 BCa 6h.DMSO BCa_6h.DMSO
> 12 BCa_6h.10nM.E2_rep4 BCa_6h.10nM.E2_rep4 BCa 6h.10nM.E2 BCa_6h.10nM.E2
> replicate old.column.id
> 1 rep1 Sample_A
> 2 rep1 Sample_B
> 3 rep1 Sample_C
> 4 rep2 Sample_D
> 5 rep2 Sample_E
> 6 rep2 Sample_F
> 7 rep3 Sample_G
> 8 rep3 Sample_H
> 9 rep3 Sample_I
> 10 rep4 Sample_J
> 11 rep4 Sample_K
> 12 rep4 Sample_L
head(dpo@raw.counts[,1:6])
> BCa_FBS_rep1 BCa_6h.DMSO_rep1 BCa_6h.10nM.E2_rep1 BCa_FBS_rep2
> protein.1 17.7830 18.3472 18.7880 16.4864
> protein.2 22.4649 22.3682 23.0818 21.3739
> protein.3 17.4850 16.9217 16.3817 17.7925
> protein.4 20.4191 20.5265 20.3660 20.4036
> protein.5 18.8366 18.8066 19.2993 17.8886
> protein.6 13.6840 12.7965 13.3525 13.0671
> BCa_6h.DMSO_rep2 BCa_6h.10nM.E2_rep2
> protein.1 17.6476 17.7074
> protein.2 21.7369 21.8696
> protein.3 17.3193 17.4829
> protein.4 20.5500 20.6084
> protein.5 18.0043 18.6683
> protein.6 13.1504 13.1831
3 Data normalization
When a DEprot object is loaded, a box/violin plot showing the distribution of the LFQ values per sample is generated automatically. This representation is useful to assess whether the data are normalized or not. Boxplots display the quantiles of the LFQ intensities, while the red and blue dashed lines correspond to the maximum and minimum LFQ values of each sample.
In this package we apply the Modified Balanced Quantile Normalization (MBQN) from the MBQN package, developed by E. Brombacher et al. (Proteomics, 2020). The modification balances the median (or mean) intensity of features (rows) that are rank invariant (RI) or nearly rank invariant (NRI) across samples (columns) before quantile normalization. This prevents an over-correction of the intensity profiles of RI and NRI features by classical quantile normalization and therefore helps reduce systematic effects in downstream analyses.
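To make the baseline concrete, the sketch below implements classical quantile normalization in base R. It is deliberately not MBQN: the balancing of RI/NRI features is omitted, complete data without missing values are assumed, and the function name `quantile_normalize` is hypothetical.

```r
# Classical quantile normalization (NOT MBQN); assumes a complete numeric matrix
quantile_normalize <- function(x) {
  ranks <- apply(x, 2, rank, ties.method = "first")  # rank of each value within its sample
  sorted <- apply(x, 2, sort)                        # each sample sorted in increasing order
  reference <- rowMeans(sorted)                      # mean value of each quantile across samples
  normalized <- apply(ranks, 2, function(r) reference[r])
  dimnames(normalized) <- dimnames(x)
  normalized
}

set.seed(1)
toy <- cbind(s1 = rnorm(5, mean = 18),
             s2 = rnorm(5, mean = 19),
             s3 = rnorm(5, mean = 20))
toy.norm <- quantile_normalize(toy)

# After normalization every sample shares the same set of values
apply(toy.norm, 2, sort)
```

Because every sample is forced onto the same reference distribution, a feature that always occupies the same rank receives the same value in all samples, which is the over-correction that the balancing step is meant to limit.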
dpo <- normalize.counts(DEprot.object = dpo,
NRI.RI.ratio.threshold = 0.5,
balancing.function = "median")
dpo
> DEprot object:
> Samples: 12
> Proteins: 13239
> Counts available: raw, normalized
> Log transformation: log2
> Metadata columns: column.id, sample.id, cell, condition, combined.id, replicate, old.column.id
dpo@normalization.method
> param value
> 1 package MBQN
> 2 method Quantile normalization
> 3 balanced TRUE
> 4 function median
> 5 NRI/RI ratio threshold 0.5
head(dpo@raw.counts[,1:6])
> BCa_FBS_rep1 BCa_6h.DMSO_rep1 BCa_6h.10nM.E2_rep1 BCa_FBS_rep2
> protein.1 17.7830 18.3472 18.7880 16.4864
> protein.2 22.4649 22.3682 23.0818 21.3739
> protein.3 17.4850 16.9217 16.3817 17.7925
> protein.4 20.4191 20.5265 20.3660 20.4036
> protein.5 18.8366 18.8066 19.2993 17.8886
> protein.6 13.6840 12.7965 13.3525 13.0671
> BCa_6h.DMSO_rep2 BCa_6h.10nM.E2_rep2
> protein.1 17.6476 17.7074
> protein.2 21.7369 21.8696
> protein.3 17.3193 17.4829
> protein.4 20.5500 20.6084
> protein.5 18.0043 18.6683
> protein.6 13.1504 13.1831
Also in this case, a box/violin plot with the normalized LFQ values of each sample is generated and stored in the boxplot.norm slot.
4 Data imputation
LFQ tables often contain many NA/NaN values due to the technical limitations of protein detection in Mass Spectrometry (MS) experiments. Here we use the missForest package, developed by D.J. Stekhoven & P. Buehlmann (Bioinformatics, 2012). This tool imputes the missing values by assigning them an estimated value. It also yields an out-of-bag (OOB) imputation error estimate (overall, or for each sample). Moreover, it can be run in parallel to save computation time (both examples are reported below).
## Without parallelization
dpo <- impute.counts(DEprot.object = dpo,
                     max.iterations = 100,
                     variable.wise.OOBerror = TRUE,
                     use.normalized.data = TRUE)
## With parallelization
dpo <- impute.counts(DEprot.object = dpo,
                     max.iterations = 100,
                     variable.wise.OOBerror = TRUE,
                     use.normalized.data = TRUE,
                     cores = 10,
                     parallel.mode = "variables")
dpo
dpo@imputation$OOBerror
data.frame(dpo@imputation[-3])
> DEprot object:
> Samples: 12
> Proteins: 13239
> Counts available: raw, normalized, imputed
> Log transformation: log2
> Metadata columns: column.id, sample.id, cell, condition, combined.id, replicate, old.column.id
> BCa_FBS_rep1 BCa_6h.DMSO_rep1 BCa_6h.10nM.E2_rep1 BCa_FBS_rep2
> 0.1443367 0.1218259 0.1429739 0.1272787
> BCa_6h.DMSO_rep2 BCa_6h.10nM.E2_rep2 BCa_FBS_rep3 BCa_6h.DMSO_rep3
> 0.1249806 0.1260666 0.1691520 0.1310784
> BCa_6h.10nM.E2_rep3 BCa_FBS_rep4 BCa_6h.DMSO_rep4 BCa_6h.10nM.E2_rep4
> 0.1338480 0.1352660 0.1462996 0.1331815
| method | max.iterations | parallelization.mode | cores | processing.time |
|---|---|---|---|---|
| missForest | 100 | variables | 10 | 14.61 mins |
Also in this case, a box/violin plot with the imputed LFQ values of each sample is generated and stored in the boxplot.imputed slot.
5 Sample similarities
5.1 Principal Component Analysis (PCA)
PCA can be used to reduce the dimensionality of the data and determine which factors explain the variability among the samples. DEprot includes functions dedicated to this aim and specifically built to work with DEprot objects.
Note that, if the data are not log-transformed, perform.PCA will log-transform them before performing the analysis.
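As a rough picture of what such an analysis involves, the sketch below runs a PCA on a toy log2 matrix with base R's `prcomp`. The centering and scaling settings are assumptions made for the example and may differ from those used internally by perform.PCA; all values are simulated.

```r
set.seed(1)
# Toy log2 matrix: 50 proteins x 6 samples (hypothetical values)
toy <- matrix(rnorm(300, mean = 18, sd = 2), nrow = 50,
              dimnames = list(paste0("protein.", 1:50),
                              paste0("sample.", 1:6)))

# prcomp expects observations (here: samples) in rows, so the matrix is transposed
pca <- prcomp(t(toy), center = TRUE, scale. = FALSE)

scores <- as.data.frame(pca$x)           # PC coordinates of each sample
importance <- summary(pca)$importance    # variance explained by each PC
importance["Cumulative Proportion", ]
```

The `scores` table can be merged with the sample metadata for plotting, and the cumulative proportion row is the quantity typically shown in a cumulative variance plot.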
5.1.1 Compute PCs
## Perform the analyses (DEprot.PCA object)
PCA <- perform.PCA(DEprot.object = dpo,
                   which.data = "imputed") # possible: raw, normalized, imputed
The DEprot.PCA object contains the following slots:
| Slot | Description |
|---|---|
| PCA.metadata | metadata of the samples used in the PCA (subset of the original DEprot@metadata) |
| sample.subset | vector containing the list of samples analyzed |
| data.used | vector indicating the type of counts used (imputed, normalized, raw) |
| prcomp | object of class prcomp corresponding to the full PCA output |
| PCs | data.frame combining the PC scores and the metadata table, useful for replotting |
| importance | statistical summary table for the PCA analyses per each PC |
| cumulative.PC.plot | ggplot object corresponding to the output of plot.PC.cumulative for this object |
5.1.2 Visualize PCAs
## Plot cumulative variance of all PCs
#### equivalent to `PCA@cumulative.PC.plot`
plot.PC.cumulative(DEprot.PCA.object = PCA,
                   bar.color = "steelblue",
                   line.color = "navyblue")
## Plot PC scatters
PC_1.2 <-
  plot.PC.scatter(DEprot.PCA.object = PCA,
                  PC.x = 1,
                  PC.y = 2,
                  color.column = "condition",
                  shape.column = "replicate",
                  label.column = NULL,
                  plot.zero.lines = FALSE) +
  # add only the horizontal zero line and hide the legend
  ggplot2::geom_hline(yintercept = 0, color = "gray", linetype = "dashed") +
  ggplot2::theme(legend.position = "none")
PC_2.3 <-
  plot.PC.scatter(DEprot.PCA.object = PCA,
                  PC.x = 2,
                  PC.y = 3,
                  color.column = "condition",
                  shape.column = "replicate",
                  label.column = NULL,
                  plot.zero.lines = TRUE)
patchwork::wrap_plots(PC_1.2, PC_2.3, nrow = 1)
5.1.3 Analyze PCs on a sample subset
These analyses can also be performed on a subset of samples by indicating the sample names of interest. In the example below we will use only the samples in which the estrogen receptor is active (E2 and FBS conditions).
## Perform the analyses (DEprot.PCA object)
PCA.fbs.e2 <-
  perform.PCA(DEprot.object = dpo,
              sample.subset = dpo@metadata$column.id[grepl("E2|FBS",
                                                           dpo@metadata$column.id)],
              which.data = "imputed")
## Plot cumulative variance of all PCs
plot.PC.cumulative(DEprot.PCA.object = PCA.fbs.e2,
                   bar.color = "indianred",
                   line.color = "firebrick4",
                   title = "**Only ERa active**")
## Plot PC scatters
PC.fbs.e2_1.2 <-
  plot.PC.scatter(DEprot.PCA.object = PCA.fbs.e2,
                  PC.x = 1,
                  PC.y = 2,
                  color.column = "condition",
                  shape.column = "replicate",
                  label.column = NULL,
                  plot.zero.lines = FALSE) +
  # add only the horizontal zero line and hide the legend
  ggplot2::geom_hline(yintercept = 0, color = "gray", linetype = "dashed") +
  ggplot2::theme(legend.position = "none")
PC.fbs.e2_2.3 <-
  plot.PC.scatter(DEprot.PCA.object = PCA.fbs.e2,
                  PC.x = 2,
                  PC.y = 3,
                  color.column = "condition",
                  shape.column = "replicate",
                  label.column = NULL,
                  plot.zero.lines = TRUE)
patchwork::wrap_plots(PC.fbs.e2_1.2, PC.fbs.e2_2.3, nrow = 1)
5.2 Correlations
Another way to define sample clusters/groups is the overall correlation between the samples. Hierarchical clustering is performed on the 1 - correlation values, since the hierarchical clustering algorithm is based on dissimilarities while correlations are a measure of similarity.
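The conversion from similarity to dissimilarity can be reproduced in a few lines of base R. The data are simulated and the object names are only illustrative; the three steps (`cor`, `as.dist(1 - ...)`, `hclust`) are standard functions from the stats package.

```r
set.seed(1)
# Toy log2 matrix: 100 proteins x 4 samples (hypothetical values)
toy <- matrix(rnorm(400, mean = 18, sd = 2), nrow = 100,
              dimnames = list(NULL, paste0("sample.", 1:4)))

corr.matrix <- cor(toy, method = "pearson")            # sample-by-sample similarity
distance <- as.dist(1 - corr.matrix)                   # similarity -> dissimilarity
cluster <- hclust(d = distance, method = "complete")   # hierarchical clustering

cluster$labels[cluster$order]                          # samples in dendrogram order
```

With this transformation two perfectly correlated samples are at distance 0, while uncorrelated samples are at distance 1.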
corr.all.samples <-
  plot.correlation.heatmap(DEprot.object = dpo,
                           which.data = "imputed",
                           palette = viridis::mako(n = 10, direction = -1, begin = 0.25),
                           correlation.scale.limits = c(0.9,1),
                           correlation.method = "pearson",
                           plot.subtitle = "All samples",
                           display.values = TRUE)
corr.all.samples
Also in this case, the sample correlation can be computed for a subset of samples, as shown before for the PCA.
corr.ERa.active <-
  plot.correlation.heatmap(DEprot.object = dpo,
                           which.data = "imputed",
                           sample.subset = dpo@metadata$column.id[grepl("E2|FBS",
                                                                        dpo@metadata$column.id)],
                           palette = viridis::magma(n = 10, direction = -1, begin = 0.25),
                           correlation.scale.limits = c(0.9,1),
                           correlation.method = "pearson",
                           plot.subtitle = "Only ERa active",
                           clustering.method = "complete",
                           display.values = TRUE)
corr.ERa.active
The DEprot.correlation object contains the following slots:
| Slot | Description |
|---|---|
| heatmap | ggplot object corresponding to the correlation heatmap |
| corr.metadata | metadata of the samples used in the correlation (subset of the original DEprot@metadata) |
| sample.subset | vector containing the list of samples analyzed |
| data.used | vector indicating the type of counts used (imputed, normalized, raw) |
| corr.matrix | the correlation matrix on which the heatmap is based |
| distance | object of class dist corresponding to the output of as.dist(1 - correlation.matrix) |
| cluster | hclust object generated by hclust(d = as.dist(1 - correlation.matrix), method = clustering.method) |
6 Differential Expression (DE) analyses
Differential expression analyses between two conditions can be performed using the function diff.analyses. The conditions are compared two-by-two (individual t-/Wilcoxon tests). It is sufficient to provide a list of 3-element vectors. Each vector should indicate a column of the metadata table (grouping factor) and the two values (groups) to compare within this column. The first group will be the numerator and the second the denominator of the fold change: c("group.column", "condition.A", "condition.B"), FoldChange = condition.A / condition.B.
In this example we will compare 6h.10nM.E2 vs 6h.DMSO, and 6h.10nM.E2 vs FBS.
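The arithmetic behind a single contrast can be sketched in base R on simulated log2 data. This is not the implementation of diff.analyses: the use of Welch's t-test (the default of `t.test`) and the exact way thresholds are combined are assumptions made for the example, and all object names are hypothetical.

```r
set.seed(1)
# Toy log2 matrices: 20 proteins, 4 replicates per group (hypothetical values)
group.A <- matrix(rnorm(80, mean = 18), nrow = 20)
group.B <- matrix(rnorm(80, mean = 18), nrow = 20)
group.A[1, ] <- group.A[1, ] + 3   # spike in one up-regulated protein

# On log2 data the log2 fold change (A/B) is the difference of the group means
log2.FC <- rowMeans(group.A) - rowMeans(group.B)

# One test per protein, then multiple-testing correction
p.value <- sapply(seq_len(nrow(group.A)), function(i) {
  t.test(group.A[i, ], group.B[i, ])$p.value
})
padj <- p.adjust(p.value, method = "bonferroni")

# Linear fold change threshold of 2 (|log2 FC| >= 1) and padj < 0.05
significant <- abs(log2.FC) >= log2(2) & padj < 0.05
head(data.frame(log2.FC, p.value, padj, significant), 3)
```

A positive log2 fold change means higher abundance in the first group (numerator), a negative one higher abundance in the second group (denominator).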
dpo_analyses <- diff.analyses(DEprot.object = dpo,
contrast.list = list(c("condition", "6h.10nM.E2", "6h.DMSO"),
c("condition", "6h.10nM.E2", "FBS")),
linear.FC.th = 2,
padj.th = 0.05,
padj.method = "bonferroni",
stat.test = "t.test",
which.data = "imputed")
dpo_analyses
> DEprot.analyses object:
> Counts used: imputed
> Fold Change threshold: 2 (linear)
> FC unresponsive range: [0.9090909,1.1] (linear)
> padj threshold: 0.05 (linear)
> padj method: bonferroni
>
>
> Differential results summary:
> contrast.id group.factor group1 group2 diff.status
> 1 condition: 6h.10nM.E2 vs 6h.DMSO condition 6h.10nM.E2 6h.DMSO 6h.DMSO
> 2 condition: 6h.10nM.E2 vs 6h.DMSO condition 6h.10nM.E2 6h.DMSO 6h.10nM.E2
> 3 condition: 6h.10nM.E2 vs 6h.DMSO condition 6h.10nM.E2 6h.DMSO unresponsive
> 4 condition: 6h.10nM.E2 vs 6h.DMSO condition 6h.10nM.E2 6h.DMSO null
> 5 condition: 6h.10nM.E2 vs FBS condition 6h.10nM.E2 FBS 6h.DMSO
> 6 condition: 6h.10nM.E2 vs FBS condition 6h.10nM.E2 FBS 6h.10nM.E2
> 7 condition: 6h.10nM.E2 vs FBS condition 6h.10nM.E2 FBS unresponsive
> 8 condition: 6h.10nM.E2 vs FBS condition 6h.10nM.E2 FBS null
> n median.FoldChange
> 1 0 NA
> 2 0 NA
> 3 9009 0.008525557
> 4 4230 0.152922231
> 5 0 NA
> 6 0 NA
> 7 9009 0.008525557
> 8 4230 0.152922231
The summary can be collected using the generic function summary.
6.1 DE results
The output is a DEprot.analyses object. This class is similar to the base DEprot class, but three additional slots are now populated:
- contrasts: corresponds to the list used to define the contrasts, but includes also the IDs of the counts matrix belonging to each subgroup.
- differential.analyses.params: a list containing the core parameters used for the differential expression analyses.
- analyses.result.list: a list with an element for each contrast including all the results of the differential analyses (see below for details).
The analyses.result.list stores, for each contrast, a list with the following elements:
| Element | Description |
|---|---|
| results | a data.frame containing the results of the analyses; includes average expression of each group, basemean, fold change, p.value and padj, diff.status |
| n.diff | a summary table showing the number of proteins in each differential expression status (up/down/unresponsive/null) |
| PCA.data | output of perform.PCA for the subset of samples analyzed in a specific contrast |
| PCA.plots | combination of 3 plots: scatter PC1-vs-PC2, scatter PC2-vs-PC3, and cumulative bar plot |
| correlations | combination of Pearson and Spearman correlation heatmaps (obtained by plot.correlation.heatmap) for the subset of samples analyzed in a specific contrast |
| volcano | volcano plot showing the log2(FoldChange) x -log10(p.adjusted) of differential expression results; it can be regenerated using plot.volcano |
| MA.plot | MA-plot showing the log2(basemean) x log2(FoldChange) of differential expression results; it can be regenerated using plot.MA |
6.1.1 DE table
The table with the results of the differential analyses can be
retrieved directly from the list in the DEprot.analyses
object
(dpo_analyses@analyses.result.list$contrast.id$results) or
using the get.results function.
## Direct access
results <- dpo_analyses@analyses.result.list$condition_6h.10nM.E2.vs.6h.DMSO$results
## Function
results <- get.results(dpo_analyses, contrast = 1)
head(results)
| prot.id | basemean.log2 | log2.mean.6h.10nM.E2 | log2.mean.6h.DMSO | log2.Fold_6h.10nM.E2.vs.6h.DMSO | p.value | padj | diff.status |
|---|---|---|---|---|---|---|---|
| protein.1 | 18.85442 | 18.91504 | 18.79380 | 0.1212396 | 0.5962212 | 1 | unresponsive |
| protein.2 | 23.07816 | 23.08739 | 23.06894 | 0.0184512 | 0.8416528 | 1 | unresponsive |
| protein.3 | 15.68530 | 15.65794 | 15.71266 | -0.0547179 | 0.8030525 | 1 | unresponsive |
| protein.4 | 21.14521 | 21.09930 | 21.19112 | -0.0918296 | 0.1360962 | 1 | unresponsive |
| protein.5 | 19.56671 | 19.62718 | 19.50624 | 0.1209374 | 0.1698674 | 1 | unresponsive |
| protein.6 | 15.31226 | 15.32551 | 15.29901 | 0.0264997 | 0.8228794 | 1 | unresponsive |
6.1.2 PCA and correlation within the comparison
The DEprot.analyses object includes PCA and correlation
analyses of the samples involved in the contrast.
6.1.3 Visualize DE analyses
Differentially expressed proteins can be visualized as either a volcano plot or an MA-plot. Both plots are available in the dpo_analyses@analyses.result.list$contrast.id list, but can also be generated using the functions plot.volcano and plot.MA.
Of note, if use.uncorrected.pvalue = TRUE, the unadjusted p-value will be used instead of the adjusted one. In this case the FoldChange and p-value thresholds are collected from the DEprot.analyses object and reapplied to compute the new differential status of the proteins.
volcano <- plot.volcano(dpo_analyses, contrast = 1, use.uncorrected.pvalue = TRUE)
MAplot <- plot.MA(dpo_analyses, contrast = 1, use.uncorrected.pvalue = TRUE)
patchwork::wrap_plots(volcano, MAplot)
7 Package information
7.1 Documentation
A detailed PDF manual, with details for each function and its parameters, is available with the package.
The R package has been published on GitHub, and a GitHub Pages website is available as well. On both sites it is possible to find the installation procedure, the required dependencies, and the links to the changelog, manual, and vignette.
7.2 Package history and releases
A list of all releases, with a description of the changes applied in each, can be found here.
7.3 Contact
For any suggestion, bug report, or comment, please contact:
Sebastian Gregoricchio sebastian.gregoricchio@nki.nl